The most annoying thing is that these methods actually work. They give better results than older non-machine learning methods, even with flawed methodology. But despite those results, they contribute nothing to our understanding of the phenomena they model, and are likely detrimental to it.
Let’s say you want to understand how a particular complex physical phenomenon works, such as fluid flow in microfluidics or how a lightning path is influenced by the air density distribution. You could try to find the mathematical laws that describe it, search for simplified analytical solutions, numerically simulate a model using some PDEs, and see how close you get to some ground truth. Or you could forgo all that, generate a large dataset, and feed it into a machine learning black box that produces good predictions on complex inputs. The first approach actually pursues understanding of the phenomenon; the black-box approach does not. More importantly, machine learning methods make the useful research that would follow more difficult.
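To make the contrast concrete, here is a minimal sketch in Python, assuming a 1-D diffusion equation as a stand-in for the PDE and scikit-learn’s MLPRegressor as the black box; the function names and the whole setup are purely illustrative, not anyone’s actual pipeline:

```python
# Toy illustration of the two routes:
# (1) "understanding-first": write down the governing PDE and step it numerically;
# (2) "black-box": generate data from the same system and fit a generic regressor.
# The PDE here is 1-D diffusion with periodic boundaries, chosen purely for brevity.
import numpy as np
from sklearn.neural_network import MLPRegressor

# --- (1) Physics route: explicit finite-difference stepping of du/dt = D * d2u/dx2 ---
def simulate_diffusion(u0, D=0.1, dx=0.01, dt=1e-4, steps=500):
    u = u0.copy()
    for _ in range(steps):
        lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2  # periodic Laplacian
        u = u + dt * D * lap
    return u

x = np.linspace(0, 1, 100, endpoint=False)
u0 = np.exp(-((x - 0.5) ** 2) / 0.01)      # initial Gaussian bump
u_true = simulate_diffusion(u0)             # "ground truth" from the model we understand

# --- (2) Black-box route: generate (initial state -> final state) pairs, fit a net ---
rng = np.random.default_rng(0)
X = np.array([np.exp(-((x - c) ** 2) / 0.01) for c in rng.uniform(0.2, 0.8, 200)])
Y = np.array([simulate_diffusion(xi) for xi in X])
surrogate = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=2000).fit(X, Y)

# The surrogate may predict u_true reasonably well, but it encodes no diffusion
# equation, no coefficient D, no boundary conditions -- nothing transferable.
u_pred = surrogate.predict(u0.reshape(1, -1))[0]
print("max abs error of surrogate:", np.abs(u_pred - u_true).max())
```

Both routes can match the ground truth on inputs like the training data, but only the first one leaves you holding an equation and a coefficient you can reason about and carry to a new regime.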
Imagine how your favorite research field would look if modern machine learning methods had been available in its infancy. For instance, suppose a gigantic model trained on millions of videos of fluid flow could simulate most real-life setups semi-accurately, and suppose it existed before any theoretical research on fluids. Would researchers have even tried to find the Navier-Stokes equations and build methods to robustly solve PDEs for varied phenomena? Or would most research focus on incrementally outcompeting the machine learning model with newer models and larger datasets on ever more esoteric inputs?
Let’s say a genius comes along, discovers Navier-Stokes and the appropriate numerical methods, and brings them to the accuracy of present-day CFD. Unless that work can accurately model the complex setups in the gigantic validation set, with fluids of varying viscosity, complex boundaries, solid-liquid interactions, and so on, it will look terrible next to the machine learning models. Would that kind of work even be seen as revolutionary? I would hope so, but seeing how novel methods are currently evaluated against machine learning models makes me doubtful.
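For reference (this is just the standard textbook form, nothing specific to the thought experiment), what that genius would be rediscovering is a momentum balance plus an incompressibility constraint, a handful of symbols that compress an enormous range of flows, which is exactly the kind of thing no surrogate model hands you:

```latex
% Incompressible Navier-Stokes equations (standard form, for reference):
\rho\left(\frac{\partial \mathbf{u}}{\partial t}
  + (\mathbf{u}\cdot\nabla)\mathbf{u}\right)
  = -\nabla p + \mu\,\nabla^{2}\mathbf{u} + \mathbf{f},
\qquad
\nabla\cdot\mathbf{u} = 0
```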
Right now, many research fields are in their infancy. Their phenomena are still being modeled, and the relationships between their components are still being worked out analytically and numerically. Machine learning will make that kind of research harder and harder to justify as long as it is rewarded and evaluated on the same footing as machine learning methods.
Most of the researchers around me agree that non-machine learning methods, even when less accurate, are more important and contribute more to our understanding than their machine learning counterparts, yet they still treat the two styles of research the same at the evaluation stage.
So I propose we split machine learning papers from non-machine learning papers into different categories, even different journals. To be clear, I’m not talking about papers in the machine learning field itself that propose new architectures or reward functions; I’m talking about applications of these models to various STEM and medical fields. We should see them for what they are: black boxes which, while useful for certain applications, do not contribute to our understanding of what they model.
Goodnight Somi.
(btw, I’ve heard explainable AI suggested as a possible solution, but honestly, extracting meaningful insights from these models seems like a meme to me. Who knows, maybe it can be done.)